AFOSR Project Final Summary
Jason  R.  Marden 


Contract/Grant Title: Game Engineering: A Multiagent Systems Perspective

Contract/Grant  #:  FA9550-12-1-0359 

Reporting  Period:  1  July  2012  to  30  June  2015 

Awards  this  Period:  1  July  2014  to  30  June  2015 

2015:  Office  of  Naval  Research  Young  Investigator  Award 
2015: SIAM/CST Best SICON Paper Prize for the paper "Achieving Pareto
Optimality Through Distributed Learning," SIAM Journal on Control and
Optimization, 2014. This work was completed under this proposal.

Top  Three  Accomplishments  During  Entire  Proposal: 

(i)  A  central  component  of  a  game  theoretic  design  is  the  assignment  of  objective 
functions  to  the  individual  agents.  The  following  paper  proves  that  generalized 
weighted  Shapley  values  fully  characterize  all  objective  design  methodologies  that 
guarantee  the  existence  of  a  pure  Nash  equilibrium  in  resource  allocation  problems 
with  separable  system  level  objective  functions.  This  result  identifies  the 
computational  complexity  associated  with  objective  design  since  computing  a 
weighted  Shapley  value  is  frequently  intractable. 

R.  Gopalakrishnan,  J.R.  Marden,  and  A.  Wierman,  "Potential  Games  are 
Necessary  to  Ensure  Pure  Nash  Equilibria  in  Cost  Sharing  Games,"  Mathematics 
of  Operations  Research,  Volume  39,  Number  4,  pp.  1252-1296,  2014. 

(ii) The goal in networked control of multiagent systems is to derive desirable
collective behavior through the design of local control algorithms. Informational
restrictions on the agents necessarily impose constraints on achievable
performance guarantees. One of the most significant accomplishments from this
period is a characterization of one such constraint with regard to the efficiency of
the resulting stable solutions for a class of networked resource allocation problems
with submodular objective functions. This characterization is given in the following
paper:

J.R.  Marden,  "The  Role  of  Information  in  Distributed  Resource  Allocation,"  IEEE 
Transactions  on  Control  of  Networked  Systems,  2015  (under  review). 

(iii)  The  vast  majority  of  the  literature  in  distributed  learning  focuses  on  attaining 
convergence  to  Nash  equilibria.  However,  it  is  widely  known  that  Nash  equilibria  are 
often  extremely  inefficient  from  a  system-wide  perspective.  Correlated  equilibria, 
on  the  other  hand,  can  often  characterize  collective  behavior  that  is  far  more 
efficient  than  even  the  best  Nash  equilibrium.  However,  previously  there  were  no 
distributed learning algorithms in the existing literature that provide convergence to
specific correlated equilibria. The following paper was the first to establish
distributed learning rules that converge in probability to the most efficient
correlated equilibrium.

J.R. Marden, "Selecting Efficient Correlated Equilibria Through Distributed
Learning," Games and Economic Behavior, 2015 (under review).

Survey  of  Accomplishments  During  Entire  Proposal: 

This project focused on the derivation and analysis of distributed learning algorithms
for attaining desirable system-wide behavior in multiagent systems. The main
directions and contributions of the work completed under this proposal are
summarized below:

(i) Performance Tradeoffs in Networked Control Systems with Informational
Constraints: The goal in networked control of multiagent systems is to derive
desirable collective behavior through the design of local control algorithms. As
highlighted above, informational restrictions impose constraints on achievable
performance guarantees. One of the most significant accomplishments from this
period is a characterization of one such constraint with regard to the efficiency of
the resulting stable solutions for a class of networked resource allocation problems
with submodular objective functions. When the agents have full information
regarding the mission space, the efficiency of the resulting stable solutions is
guaranteed to be at least 50% of optimal. However, when the agents have only
localized information about the mission space, which is a common feature of many
well-studied control designs, the efficiency of the resulting stable solutions can be
as low as 1/n of optimal, where n is the number of agents. Consequently, in general
such schemes cannot guarantee that a system of n agents performs any better than
a single agent under identical environmental conditions.
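
The full-information guarantee can be checked by brute force on a toy instance. The sketch below is illustrative only and is not taken from the cited papers: it assumes a coverage-style welfare (the total value of the distinct resources selected) and assumes each agent is assigned its marginal contribution to that welfare as its objective, which is one standard design for which stable solutions achieve at least half of the optimal welfare. The resource names, values, and action sets are hypothetical.

```python
from itertools import product

def welfare(values, joint_action):
    """Total value of the distinct resources covered by the joint action."""
    return sum(values[r] for r in set(joint_action))

def marginal_utility(values, joint_action, i):
    """Agent i's marginal contribution: welfare with i minus welfare without i."""
    others = joint_action[:i] + joint_action[i + 1:]
    return welfare(values, joint_action) - welfare(values, others)

def pure_nash_equilibria(values, action_sets):
    """Enumerate joint actions from which no agent can strictly improve alone."""
    equilibria = []
    for a in product(*action_sets):
        stable = True
        for i, options in enumerate(action_sets):
            current = marginal_utility(values, a, i)
            for alt in options:
                b = a[:i] + (alt,) + a[i + 1:]
                if marginal_utility(values, b, i) > current + 1e-12:
                    stable = False
        if stable:
            equilibria.append(a)
    return equilibria

def efficiency_ratio(values, action_sets):
    """Welfare of the worst pure Nash equilibrium divided by the optimal welfare."""
    best = max(welfare(values, a) for a in product(*action_sets))
    worst = min(welfare(values, a)
                for a in pure_nash_equilibria(values, action_sets))
    return worst / best

# Hypothetical instance: two agents, three resources.
values = {"r1": 1.0, "r2": 1.0, "r3": 0.0}
action_sets = [["r1", "r2"], ["r1", "r3"]]

print(pure_nash_equilibria(values, action_sets))
print(efficiency_ratio(values, action_sets))  # 0.5
```

In this instance the joint action (r1, r3) is stable yet covers only half of the attainable value, so the 50% bound is met with equality. The enumeration is exponential in the number of agents and is intended only for small examples.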

The natural question that emerges is what information presented to the agents
could be exploited to overcome such efficiency limitations. In the context of the
well-studied sensor coverage problem, this work identifies how limited aggregate
information regarding the environment can overcome these limitations. In
particular, when the sensors only have a localized view of the mission space, the
achievable performance guarantee is (1/n) of optimal. However, if each sensor also
has access to the "search value associated with the worst performing sensor" and
"the general direction of the worst performing sensor", control algorithms can then
be designed that guarantee a performance that is at least (1/2) of optimal. While the
derived results fall within the context of the well-studied sensor coverage problem,
the general guidelines should be extendable to broader settings as well.

Further, new results also highlighted an inherent tradeoff between desirable
long-term efficiency guarantees and the resulting convergence rates in multiagent
systems.

See publications [J2, C1, C2, C5].


(ii) Attaining Efficient Correlated Behavior Through Distributed Learning: The vast
majority  of  the  literature  in  distributed  learning  focuses  on  attaining  convergence  to 
Nash  equilibria.  However,  it  is  widely  known  that  Nash  equilibria  are  often  extremely 
inefficient  from  a  system-wide  perspective.  Correlated  equilibria,  on  the  other  hand, 
can  often  characterize  collective  behavior  that  is  far  more  efficient  than  even  the 
best  Nash  equilibrium.  However,  previously  there  were  no  distributed  learning 
algorithms  in  the  existing  literature  that  provide  convergence  to  specific  correlated 
equilibria.  In  this  activity,  we  provide  two  such  algorithms.  The  first  algorithm 
ensures  that  the  behavior  of  the  agents  can  be  characterized  by  deterministic  cycles, 
which  have  an  empirical  frequency  that  is  aligned  with  the  most  efficient  correlated 
equilibrium.  The  second  algorithm  we  propose  in  this  activity  guarantees  that  the 
agents'  collective  joint  strategy  will  constitute  an  efficient  correlated  equilibrium 
with  high  probability.  The  key  to  attaining  this  second  algorithm  involved 
incorporating  a  common  random  signal  into  the  learning  environment. 
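
The efficiency gap between Nash and correlated equilibria can be seen in a small textbook example. The sketch below does not implement either learning algorithm from this project; it only checks the correlated equilibrium conditions for a given joint distribution. The payoff numbers (a two-player game of Chicken) and the distribution are illustrative assumptions.

```python
# Payoffs for a two-player game of Chicken (illustrative numbers).
# Actions: 0 = yield, 1 = dare. PAYOFF[(a1, a2)] = (u1, u2).
PAYOFF = {
    (0, 0): (6, 6),
    (0, 1): (2, 7),
    (1, 0): (7, 2),
    (1, 1): (0, 0),
}
ACTIONS = (0, 1)

def is_correlated_equilibrium(dist, tol=1e-12):
    """True if no player gains by swapping any recommended action for another."""
    for player in (0, 1):
        for recommended in ACTIONS:
            for deviation in ACTIONS:
                gain = 0.0
                for joint, prob in dist.items():
                    if joint[player] != recommended:
                        continue
                    deviated = list(joint)
                    deviated[player] = deviation
                    gain += prob * (PAYOFF[tuple(deviated)][player]
                                    - PAYOFF[joint][player])
                if gain > tol:
                    return False
    return True

def expected_welfare(dist):
    """Expected sum of both players' payoffs under the joint distribution."""
    return sum(prob * sum(PAYOFF[joint]) for joint, prob in dist.items())

def is_pure_nash(joint):
    """A pure Nash equilibrium is a correlated equilibrium on a single joint action."""
    return is_correlated_equilibrium({joint: 1.0})

# A common signal recommends (yield, yield) half of the time and each
# asymmetric outcome a quarter of the time.
correlated = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.25}

pure_nash = [a for a in PAYOFF if is_pure_nash(a)]
best_pure_nash_welfare = max(sum(PAYOFF[a]) for a in pure_nash)

print(is_correlated_equilibrium(correlated))  # True
print(expected_welfare(correlated))           # 10.5
print(best_pure_nash_welfare)                 # 9
```

The distribution places no weight on the mutually destructive outcome and cannot be written as a product of independent mixed strategies, which is why a common random signal is needed to realize it.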

The  results  highlighted  in  the  previous  progress  report  focused  on  ensuring  that  the 
empirical  frequency  of  play  was  aligned  with  the  most  efficient  (coarse)  correlated 
equilibrium.  The  results  derived  this  period  extended  such  results  to  ensure  that  the 
day-to-day  collective  play  was  consistent  with  the  most  efficient  coarse  correlated 
equilibria.  Here,  such  randomness  in  day-to-day  joint  policies  is  essential  for  ensuring 
desirable performance in team scenarios relevant to the Department of Defense,
e.g., team-versus-team zero-sum games. A key novelty here is the introduction of a
common  random  signal  into  the  learning  environment  that  is  exploited  to  attain 
randomized 

See publications [J2, C1, C2, C5].

(iii)  Characterizing  the  Impact  of  Adversarial  Interventions  in  Multiagent 
Coordination:  In  a  multi-agent  system,  transitioning  from  a  centralized  to  a 
distributed  decision-making  strategy  can  introduce  vulnerability  to  adversarial 
manipulation.  In  this  work,  we  studied  the  potential  for  adversarial  manipulation  in  a 
class  of  graphical  coordination  games  where  the  adversary  can  pose  as  a  friendly 
agent  in  the  game,  thereby  directly  influencing  the  decision-making  rules  of  a  subset 
of  agents.  The  adversary's  influence  can  cascade  throughout  the  system,  indirectly 
influencing  other  agents'  behavior  and  significantly  impacting  the  emergent 
collective  behavior.  These  preliminary  results  focused  on  characterizing  conditions 
by  which  the  adversary's  local  influence  can  dramatically  impact  the  emergent  global 
behavior,  e.g.,  destabilize  efficient  equilibria.  Furthermore,  preliminary  results 
demonstrate  empirically  that  safeguarding  a  multiagent  system  against  adversarial 
interventions  comes  at  the  expense  of  degrading  the  responsiveness  in  the 
multiagent  system,  e.g.,  convergence  rates. 

See publication [C3].
(iv) Robust Mechanisms for Social Influence: Uninfluenced social systems often
exhibit  suboptimal  performance;  a  common  mitigation  technique  is  to  charge  agents 
specially-designed  taxes,  influencing  the  agents'  choices  and  thereby  bringing 
aggregate  social  behavior  closer  to  optimal.  In  general,  the  efficiency  guaranteed  by 
a  particular  taxation/influencing  methodology  is  limited  both  by  the  quality  of 
information  available  to  the  system-designer  and  the  sophistication  of  the  available 
taxation  methodologies.  If  the  tax-designer  possesses  a  perfect  characterization  of 
the  system,  it  is  often  straightforward  to  design  taxes  that  perfectly  align  agents' 
incentives  with  the  designer's  global  objective.  However,  as  the  quality  of  the 
designer's  information  decreases,  increasingly  sophisticated  methodologies  are 
required  to  achieve  the  same  efficiency  target. 

In this direction, we offer a preliminary study on the role of robust taxation
mechanisms to influence behavior in a class of routing problems. More specifically, we
study  the  application  of  taxes  to  a  network-routing  game,  and  we  assume  that  the 
tax-designer  knows  neither  the  network  topology  nor  the  tax-sensitivities  and 
demands  of  the  agents.  We  show  that  it  is  possible  to  design  taxes  that  guarantee 
that  selfish  network  flows  are  arbitrarily  close  to  optimal  flows,  despite  the  fact  that 
agents'  tax-sensitivities  are  unknown  to  us.  We  term  these  taxes  "universal,"  since 
they  enforce  optimal  behavior  in  any  routing  game  without  a  priori  knowledge  of  the 
specific  game  parameters.  In  general,  these  taxes  may  be  arbitrarily  high; 
accordingly,  for  affine  cost  parallel-network  routing  games,  we  explicitly  derive  the 
optimal  bounded  tolls  and  the  best-possible  efficiency  guarantee  as  a  function  of  a 
toll  upper-bound.  Finally,  we  restrict  attention  to  very  simple  fixed-toll 
methodologies  and  show  that  they  are  incapable  of  providing  strong  efficiency 
guarantees  if  the  designer  lacks  good  information  about  either  the  network  topology 
or  the  user  sensitivities. 
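
A two-link example gives a feel for how a toll can work without knowledge of the users' sensitivity. The sketch below is a hedged illustration, not the construction in [J6, C9]: it assumes a Pigou-style network with unit demand, latency c1(x) = x on link 1 and c2 = 1 on link 2, a single unknown sensitivity s shared by all users, and a toll of the assumed form kappa * (c(f) + f * c'(f)) on each link. The function names and the parameter kappa are hypothetical.

```python
def tolled_equilibrium_flow(sensitivity, kappa):
    """Equilibrium flow on link 1 of the assumed two-link network.

    Perceived costs (latency plus sensitivity times toll):
        link 1: x + s * kappa * 2x
        link 2: 1 + s * kappa * 1
    Equating the two gives x = (1 + s*kappa) / (1 + 2*s*kappa).
    """
    sk = sensitivity * kappa
    return (1.0 + sk) / (1.0 + 2.0 * sk)

def total_latency(x):
    """Total latency when a fraction x of the unit demand uses link 1."""
    return x * x + (1.0 - x)

# The latency-minimizing split sends half of the demand over each link.
OPTIMAL_LATENCY = total_latency(0.5)

def efficiency_gap(sensitivity, kappa):
    """Ratio of tolled equilibrium latency to optimal latency (1.0 is optimal)."""
    return total_latency(tolled_equilibrium_flow(sensitivity, kappa)) / OPTIMAL_LATENCY

for s in (0.1, 1.0, 10.0):
    print(s, [round(efficiency_gap(s, k), 6) for k in (0.0, 1.0, 100.0, 1e6)])
```

With kappa = 0 every user takes link 1 and the total latency is 4/3 of optimal. As kappa grows, the equilibrium flow approaches the optimal split for every positive sensitivity, at the price of tolls that grow without bound, which is the tradeoff that motivates bounded tolls.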

Extending  such  results  to  the  domain  of  human-agent  cooperative  systems  is  an 
ongoing  research  focus. 

See  publications  [J6,  C9]. 

(v)  Methodologies  for  Utility  Design  in  Distributed  Engineering  Systems:  A  central 
component  of  a  game  theoretic  design  is  the  assignment  of  objective  functions  to 
the  individual  agents.  The  design/influence  of  agent  objective  functions  for  social 
systems  has  been  studied  extensively  in  the  game  theoretic  literature,  e.g.,  cost 
sharing problems and mechanism design; however, the differences between the
constraints and objectives pertaining to social and engineering systems require
looking at this literature from a new perspective.

The  core  objective  in  engineering  systems  is  to  establish  a  dynamical  process  that 
converges  to  an  efficient  outcome.  Accordingly,  there  are  several  competing 
objectives  that  a  system  designer  needs  to  consider  when  contemplating  the 
underlying  design  including  the  locality  of  the  agents'  objective  functions,  the 
structure  of  the  resulting  game,  the  existence  and  efficiency  of  equilibria,  among 
many  more.  Here,  our  results  focused  on  the  development  of  such  methodologies 
for meeting the above objectives. A notable result from this section, in [J7], proves
that  generalized  weighted  Shapley  values  fully  characterize  all  objective  design 
methodologies  that  guarantee  the  existence  of  a  pure  Nash  equilibrium  in  resource 
allocation  problems  with  separable  system  level  objective  functions.  This  result 
identifies  the  computational  complexity  associated  with  objective  design  since 
computing  a  weighted  Shapley  value  is  frequently  intractable. 
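
The computational burden can be seen in a direct implementation. The sketch below computes the standard (equal-weight) Shapley value, a special case of the weighted family, by averaging marginal contributions over every ordering of the agents. It is an illustration under assumed inputs and not the generalized weighted construction of [J7]; the agent names, capabilities, and welfare function are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def shapley_values(agents, welfare):
    """Average each agent's marginal contribution over all arrival orders.

    The loop visits len(agents)! orderings, so the cost grows factorially
    with the number of agents sharing a resource.
    """
    totals = {i: Fraction(0) for i in agents}
    orders = list(permutations(agents))
    for order in orders:
        coalition = frozenset()
        for i in order:
            grown = coalition | {i}
            totals[i] += Fraction(welfare(grown)) - Fraction(welfare(coalition))
            coalition = grown
    return {i: total / len(orders) for i, total in totals.items()}

# Hypothetical resource whose value equals the best capability among its users.
CAPABILITY = {"a": 1, "b": 2, "c": 3}

def max_capability(coalition):
    """Welfare of a coalition; the empty coalition generates no value."""
    return max((CAPABILITY[i] for i in coalition), default=0)

shares = shapley_values(list(CAPABILITY), max_capability)
print(shares)
print(sum(shares.values()))  # 3
```

For this instance the shares are 1/3, 5/6, and 11/6, which sum to the welfare of the full coalition. Exact rational arithmetic is used so that this budget-balance property can be checked without rounding error.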

A  fundamental  problem  that  arises  in  distributed  systems  is  efficiency  loss.  That  is, 
the  system  level  performance  associated  with  stable  solutions  could  potentially  be 
much  worse  than  the  optimal  system  level  performance.  Characterizing  efficiency 
bounds  is  essential  for  providing  performance  guarantees  on  the  system  behavior; 
however,  establishing  such  bounds  is  fundamentally  challenging  as  evidenced  by  the 
lack of such results in the existing literature in distributed control. An opportunity for
characterizing such bounds is to leverage the significant body of research in
the field of algorithmic game theory devoted to analyzing the inefficiency of Nash
equilibria in distributed systems, cf. the price of anarchy. Most of the literature
regarding  price  of  anarchy  is  purely  analytical  with  no  design  component;  hence,  its 
applicability  to  engineering  systems  is  somewhat  limited  in  its  current  state. 
Establishing  a  methodology  that  guarantees  the  existence  of  a  pure  Nash  equilibrium 
in  addition  to  optimizing  the  price  of  anarchy  would  have  profound  implications  for 
multiagent  coordination  in  both  social  and  engineering  systems  by  improving  the 
operational  efficiency  of  such  systems.  The  characterization  highlighted  above 
identifies all methodologies that guarantee the existence of a pure Nash
equilibrium;  hence,  this  result  characterizes  the  complete  design  space  that  a  system 
designer  needs  to  consider  when  the  goal  is  to  optimize  the  price  of  anarchy. 
Furthermore, preliminary results derive such "optimal" agent objective functions
for  specific  problem  instantiations,  e.g.,  network  coding  and  submodular  resource 
allocation  problems.  Ongoing  work  is  seeking  to  identify  more  "universal" 
methodologies  for  optimizing  the  price  of  anarchy  in  distributed  engineering 
systems. 

See publications [J5, J7, C7].